40 research outputs found

    A semiparametric approach for a multivariate sample selection model

    Most common estimation methods for sample selection models rely heavily on parametric and normality assumptions. In this paper we consider a multivariate semiparametric sample selection model and develop a geometric approach to estimating the slope vectors in the outcome equation and in the selection equation. Unlike most existing methods, we treat both slope vectors symmetrically. Moreover, the estimation method is link-free and distribution-free. It works in two main steps: a multivariate sliced inverse regression step and a canonical analysis step. We establish √n-consistency and asymptotic normality of the estimates. We describe how to estimate the observation and selection link functions. The theory is illustrated with a simulation study.

    A new sliced inverse regression method for multivariate response

    A semiparametric regression model of a q-dimensional multivariate response y on a p-dimensional covariate x is considered. A new approach based on sliced inverse regression (SIR) is proposed for estimating the effective dimension reduction (EDR) space without requiring a prespecified parametric model. The estimated EDR space is shown to converge at rate √n. The choice of the dimension of the EDR space is discussed. Moreover, a way to cluster the components of y related to the same EDR space is provided, so that the proposed multivariate SIR method can be applied to each cluster rather than blindly to all components of y. The numerical performance of multivariate SIR is illustrated in a simulation study. Applications to a remote sensing dataset and to the Minneapolis elementary schools data are also provided. Although the proposed methodology relies on SIR, it opens the door to new regression approaches with a multivariate response.
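For readers unfamiliar with SIR, the following is a minimal sketch of the classical univariate-response version of sliced inverse regression, not the multivariate method of the paper above. The function name `sir_directions` and its parameters are illustrative assumptions, and NumPy is assumed to be available.

```python
import numpy as np


def sir_directions(x, y, n_slices=10, n_directions=1):
    """Classical univariate SIR sketch (hypothetical helper, not the paper's method).

    x: (n, p) covariate matrix, y: (n,) response.
    Returns a (p, n_directions) matrix of unit-norm estimated EDR directions.
    """
    n, p = x.shape
    mean = x.mean(axis=0)
    cov = np.cov(x, rowvar=False)

    # Inverse square root of the covariance, used to standardize x.
    evals, evecs = np.linalg.eigh(cov)
    inv_sqrt = evecs @ np.diag(evals ** -0.5) @ evecs.T
    z = (x - mean) @ inv_sqrt

    # Slice the sample by the order of y into groups of nearly equal size.
    order = np.argsort(y)
    slices = np.array_split(order, n_slices)

    # Weighted covariance of the slice means of the standardized covariate.
    m = np.zeros((p, p))
    for idx in slices:
        zbar = z[idx].mean(axis=0)
        m += (len(idx) / n) * np.outer(zbar, zbar)

    # Leading eigenvectors of m, mapped back to the original x scale.
    _, vecs = np.linalg.eigh(m)          # eigenvalues in ascending order
    top = vecs[:, ::-1][:, :n_directions]
    beta = inv_sqrt @ top
    return beta / np.linalg.norm(beta, axis=0)
```

The directions are identified only up to sign and scale, so comparisons with a true direction should use the absolute cosine between the two vectors.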

    Orthogonal rotation in PCA of mixed data. The PCAmixdata package and an application in cultural sociology.

    A sliced inverse regression approach for block-wise evolving data streams


    Clustering of Variables for Mixed Data

    This chapter presents clustering of variables, whose aim is to group together strongly related variables. The proposed approach works on a mixed data set, i.e. a data set that contains both numerical and categorical variables. Two algorithms for clustering variables are described: a hierarchical clustering and a k-means type clustering. A brief description of the PCAmix method (a principal component analysis for mixed data) is provided, since the computation of the synthetic variables summarizing the obtained clusters of variables is based on this multivariate method. Finally, the R packages ClustOfVar and PCAmixdata are illustrated on real mixed data. The PCAmix (resp. ClustOfVar) approach is first used for dimension reduction (step 1) before standard clustering of the individuals (step 2).

    Variable importance assessment in sliced inverse regression for variable selection

    We study the relationship between a dependent variable $y$ and a multivariate covariate $x \in \mathbb{R}^p$ in a semiparametric regression model. Since the purpose of most social, biological, or environmental science research is explanation, determining the importance of the variables is a major concern: it identifies which variables matter most when predicting $y$. Sliced inverse regression methods reduce the dimension of the covariate $x$ by estimating the directions $\beta$ that span an effective dimension reduction (EDR) space. The aim of this paper is to propose a computational method based on a variable importance measure (relying only on the EDR space) in order to select the most useful variables. The numerical behavior of this new method, implemented in R, is studied in a simulation study. An illustration on real data is also provided.